Abstract


Diagnostics based on robust R-estimates of regression coefficients are
developed. These methods are not as sensitive to influential points as least
squares diagnostics. In data sets with several influential points,
diagnostics based on a robust fit have a greater chance of detecting
interesting cases for further inspection. Robust analogues of the internal
and external t statistics, DFFITS, DCOOK, and DFBETAS are developed and
illustrated on two data sets.


Key words: Linear Models, Robustness, Regression.
1. Introduction

A  regression  model  is  at  best  an  approximation  to  the  reality  of  the 
situation  under  study.  Regression  diagnostics  are  invaluable  tools  for 
detecting  data  points  at  which  the  model  and  the  data  differ  greatly  (such 
points  are  called  outliers)  as  well  as  data  points  which  have  a  large 
influence on the model. In the last ten years there has been much interest
in  the  area  of  regression  diagnostics.  A  testament  to  this  is  that  a  number 
of  diagnostics  are  currently  available  in  all  the  major  statistical  computing 
packages.  Regression  diagnostics  are  discussed  in  detail  in  the  books  by 
Cook  and  Weisberg  (1982)  and  Belsley,  Kuh  and  Welsch  (1980)  and  in  the  review 
articles  by  Hocking  (1983),  Chatterjee  and  Hadi  (1986),  and  Hettmansperger 
(1987). 

It  is  well  known,  however,  that  a  few  influential  points  can  spoil  the 
least  squares  fit  of  a  linear  model.  In  data  sets  with  several  influential 
points,  some  of  these  points  can  exert  such  a  strong  influence  on  the  least 
squares  fit  that  other  influential  points  are  masked  and,  hence,  are  not 
detected by these diagnostic procedures. The data sets discussed in Section
4 are illustrations of this effect. Examination of data sets containing
influential  points,  based  on  estimates  that  are  impaired  by  such  points,  is  a 
serious  drawback  to  diagnostics  based  on  least  squares.  In  these 
circumstances,  the  traditional  diagnostics  suffer  a  lack  of  detection  power. 

Over  the  last  ten  years,  the  area  of  robust  regression  has  also  become  a 
rapidly  expanding  field.  Some  of  the  major  statistical  packages  now  contain 
some  form  of  robust  regression.  Using  a  robust  fitting  method  reduces  the 
effect  of  influential  points  on  the  fitted  model.  However,  a  number  of 
authors  point  out  that  the  exclusive  use  of  robust  methods  can  obscure 
important  substantive  problems  with  the  model  which  in  some  situations  are 
revealed  by  regression  diagnostics  based  on  least  squares;  see  Cook  (1986) 
and  Chatterjee  and  Hadi  (1986). 

In this paper we develop diagnostics based on robust R-estimates of
regression coefficients. Similar methods can be used to develop diagnostics
based on other classes of robust estimates. These estimates are not as
sensitive to influential points as least squares and the resulting diagnostics
appear to be more powerful than the least squares based methods. In data sets
with  several  influential  points,  the  diagnostics  based  on  the  robust  fit 
therefore  have  a  greater  chance  of  detecting  influential  points  than  those 
based  on  least  squares.  In  Section  3,  we  develop  robust  analogues  of  the 
internal  and  external  t  statistic,  DFFITS,  DCOOK,  and  DFBETAS.  As  Sections  2 
and  3  demonstrate,  their  geometry  is  quite  similar  to  their  least  squares 
counterparts.  The  last  four  techniques  measure  the  impact  of  an  individual 
case  on  the  robust  fit.  In  the  examples  of  Section  4,  these  four  techniques 
are  able  to  detect  some  obvious  outliers  whereas  the  same  techniques  based  on 
least  squares  are  not.  The  robust  diagnostics  can  thus  be  used  to  flag 
potential  cases  of  trouble  and  should  serve  as  quite  useful  tools  in  linear 
model fitting.

In Appendix A, we present a unified development of some of the more
useful  least  squares  diagnostics.  The  derivations  are  based  on  the  mean 
shift  outlier  model;  see  Cook  and  Weisberg  (1982,  p.20). 


2.  Notation  and  R-Estimates 
Consider  the  linear  model 

$Y = \alpha\mathbf{1} + X_c\beta + e$  (2.1)

where $\mathbf{1}$ denotes an $n \times 1$ vector of ones, $X_c$ is an $n \times (p-1)$ centered design matrix
having full column rank, $\alpha$ is an intercept parameter, $\beta$ is a $(p-1) \times 1$ vector of
parameters, and $e$ is an $n \times 1$ vector of random errors whose components are
i.i.d. with distribution function $F$ and density $f$. Letting $X = [\mathbf{1} : X_c]$ and
$b = (\alpha, \beta')'$, we can write the model as

$Y = Xb + e.$


Discussions of R-estimates for this linear model can be found in
Hettmansperger and McKean (1977). Briefly, consider Jaeckel's (1972)
dispersion function which is given by

$D(\beta) = \sum_{i=1}^{n} a(R(Y_i - x_{ic}'\beta))\,(Y_i - x_{ic}'\beta)$  (2.2)

where $x_{ic}'$ is the ith row of $X_c$, $R(u_i)$ denotes the rank of $u_i$ among $u_1, \ldots, u_n$
and $\{a(i)\}$ is a set of scores which are generated as

$a(i) = \varphi(i/(n+1))$  (2.3)

where $\varphi$ is a nondecreasing function defined on (0,1) such that $\int \varphi(u)\,du = 0$
and $\int \varphi^2(u)\,du = 1$. Examples of such score functions are the Wilcoxon,
$\varphi(u) = \sqrt{12}\,(u - \tfrac{1}{2})$, and the sign scores $\varphi(u) = \mathrm{sgn}(u - \tfrac{1}{2})$.

Jaeckel (1972) showed that $D$ is a continuous, convex function of $\beta$ and
proposed estimating $\beta$ by $\hat\beta_R$ where $D(\hat\beta_R) = \min_{\beta} D(\beta)$. McKean and
Hettmansperger (1976) proposed testing subvectors of $\beta$ by using the reduction
in $D(\beta)$ due to fitting the full and reduced models. Algorithms for obtaining
$\hat\beta_R$ can be found in McKean and Hettmansperger (1978) and Osborne (1985).

Version 6 of MINITAB contains commands which return $\hat\beta_R$.
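
As an illustration of (2.2) and (2.3), the following is a minimal sketch, assuming Wilcoxon scores and a crude grid search over a single slope. The toy data, the grid, and the tie-breaking rule for ranks are assumptions made for this example; the sketch does not reproduce any of the algorithms cited above.

```python
import math

def wilcoxon_score(u):
    """Wilcoxon score phi(u) = sqrt(12) * (u - 1/2) on (0, 1)."""
    return math.sqrt(12.0) * (u - 0.5)

def sign_score(u):
    """Sign score phi(u) = sgn(u - 1/2)."""
    return (u > 0.5) - (u < 0.5)

def ranks(values):
    """Ranks 1..n; ties are broken by position (an assumption for simplicity)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        out[k] = rank
    return out

def dispersion(beta, y, xc, score=wilcoxon_score):
    """Jaeckel's D(beta) = sum_i a(R(u_i)) * u_i with u_i = y_i - x_ic' beta."""
    n = len(y)
    u = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, xc)]
    r = ranks(u)
    return sum(score(r[i] / (n + 1)) * u[i] for i in range(n))

# Hypothetical toy data: one centered predictor; the last response is an outlier.
y = [1.0, 2.1, 2.9, 4.2, 12.0]
xc = [[-2.0], [-1.0], [0.0], [1.0], [2.0]]

# Crude grid minimization of the convex function D over the slope.
grid = [k / 10 for k in range(0, 41)]
beta_hat = min(grid, key=lambda b: dispersion([b], y, xc))
```

Because the scores sum to zero, adding a constant to every response leaves `dispersion` unchanged, which is why the intercept has to be estimated separately.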

Under mild regularity conditions, found in Heiler and Willers (1979), $\hat\beta_R$
satisfies

$\hat\beta_R = \beta + \tau (X_c'X_c)^{-1} X_c' a(R(e)) + o_p(1)$  (2.4)

where $a(R(e))$ denotes the vector with components $a(R(e_i))$ and $\tau$ is a scale
parameter defined by

$\tau^{-1} = \int \varphi(u) \left( -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))} \right) du.$  (2.5)

Discussions of consistent estimates of $\tau$ based on the residuals $Y - X_c\hat\beta_R$ can
be found in Koul et al. (1987) and Aubuchon and Hettmansperger (1988). Under
these regularity conditions $X_c' a(R(e))$ is approximately $N_{p-1}(0, X_c'X_c)$; hence,

$\hat\beta_R$ is approximately $N_{p-1}(\beta, \tau^2 (X_c'X_c)^{-1})$.  (2.6)

Note if $X_c = [X_{1c} | X_{2c}]$ and $X_{1c}$ and $X_{2c}$ are orthogonal, i.e. $X_{1c}'X_{2c} = 0$,
then $\hat\beta_{1R}$ and $\hat\beta_{2R}$ are asymptotically independent. While this does not imply,
in finite samples, that the R-estimates of $\beta_1$ are the same in the reduced and
full models, we have found in practice that the estimates do not differ by
very much. For use in Section 4, we will term this approximate orthogonality.

Also, since ranks are invariant to constant shifts, the intercept
parameter cannot be estimated from $D(\beta)$. If symmetry of the error
distribution seems to be a tenable assumption, then scores satisfying

$\varphi(1-u) = -\varphi(u)$  (2.7)

are suitable; for instance, both the Wilcoxon and sign scores cited above
satisfy this condition. Then the intercept can be estimated by using a one
sample location R-estimate which corresponds to the chosen score function;
see McKean and Hettmansperger (1978). Let $\hat\alpha_R$ denote this estimate and let
$\hat b_R = (\hat\alpha_R, \hat\beta_R')'$. Under regularity conditions which include symmetry of the
distribution of the errors, $\hat b_R$ is approximately $N_p(b, \tau^2 (X'X)^{-1})$.

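If symmetry of the error distribution is not tenable then, to avoid its
assumption, we take the intercept to be the median of the distribution
$F(y - x'\beta)$ and estimate the intercept by

$\hat\alpha^* = \mathrm{med}\{Y_i - x_{ic}'\hat\beta_R\}.$

Under the same regularity conditions as cited for (2.4), we have

$\hat\alpha^* = \alpha + \tau^* \frac{1}{n} \mathbf{1}' a^*(e) + o_p(1)$  (2.8)

where $\tau^* = (2f(0))^{-1}$ and $a^*(e_i) = \mathrm{sgn}(e_i)$. Estimation of $\tau^*$ is discussed by
McKean and Schrader (1984). It then follows that

$\begin{pmatrix} \hat\alpha^* \\ \hat\beta_R \end{pmatrix}$ is approximately $N_p\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \begin{pmatrix} \tau^{*2}/n & 0' \\ 0 & \tau^2 (X_c'X_c)^{-1} \end{pmatrix} \right).$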

3.  R-Diagnostics 

3.1.  Internal  R-studentized  residuals. 

Similar to the least squares residuals, the variance of the R-residuals,
$\hat e_{R,i}$, depends on both the linear model and the underlying variation of the
errors. The internal studentized least squares residuals, see Appendix A,
have proved useful in diagnostic procedures since they correct for both the
model and the underlying variance. The internal R-studentized residuals
defined below, (3.5), are similarly standardized R-residuals.

As discussed in Section 2, let $\hat\alpha^*$ and $\hat\beta_R$ denote the R-estimates of $\alpha$ and
$\beta$. Denote the residuals by $\hat e_R = Y - \hat\alpha^*\mathbf{1} - X_c\hat\beta_R$. In order to standardize
these residuals we need an estimate of the variance-covariance matrix
$\mathrm{Cov}(\hat e_R)$. From Appendix B, equation (B.9), we have the approximation


$\hat e_R \doteq (Y - \mathbf{1}\alpha - X_c\beta) - \tau^* J a^* - \tau H_c a$  (3.1)

where $a^*$ and $a$ denote the vectors $a^*(e)$ and $a(R(e))$ given in expressions
(2.8) and (2.4), $J = n^{-1}\mathbf{1}\mathbf{1}'$, and $H_c = X_c(X_c'X_c)^{-1}X_c'$. Throughout this
paper $\doteq$ refers to first order
approximations as developed in Appendix B. As shown in (B.1) of Appendix B,
an estimate of the variance-covariance matrix of $\hat e_R$ is

$S = \hat\sigma^2 (I - K_1 J - K_2 H_c)$  (3.2)

where $K_1 = (\hat\tau^*/\hat\sigma)^2 ((2\hat\delta^*/\hat\tau^*) - 1)$,

$K_2 = (\hat\tau/\hat\sigma)^2 ((2\hat\delta/\hat\tau) - 1)$,


and $\hat\delta = \frac{1}{n-p} D(\hat\beta_R)$.

The estimators $\hat\tau^*$ and $\hat\tau$ are discussed in Section 2 and $D(\hat\beta_R)$ is defined by
(2.2).


To complete the estimate of $\mathrm{Cov}(\hat e_R)$ we need an estimate of $\sigma^2$. One
possibility is to use the least squares estimate $\hat\sigma^2$. This is a consistent
estimate provided the errors have finite variance. There are other
possibilities but they involve assumptions on the form of the distribution;


for example, one such estimate is consistent provided the errors have a normal
distribution.  For  robustness,  a  mildly  trimmed  or  winsorized  mean  square 
error  could  be  used,  see  Shoemaker  and  Hettmansperger  (1982). 


It follows from (3.2) that an estimate of $\mathrm{Var}(\hat e_{R,i})$ is

$s_{R,i}^2 = \hat\sigma^2 (1 - K_1 \tfrac{1}{n} - K_2 h_{ic}),$  (3.3)

where $h_{ic} = x_{ic}'(X_c'X_c)^{-1}x_{ic}$.

Note that in the least squares case $s_{LS,i}^2 = \hat\sigma^2(1 - h_i)$ and $h_i = n^{-1} + h_{ic}$, the
ith diagonal element of the least squares projection matrix, which is the ith
leverage value. Hence $K_1$ and $K_2$ can be viewed as corrections due to using the
rank based fitting method. If the error distribution is symmetric (3.3)
reduces to

$s_{R,i}^2 = \hat\sigma^2 (1 - K_2 h_i).$  (3.4)


We define the internal R-studentized residuals as

$r_{R,i} = \hat e_{R,i} / s_{R,i}, \quad i = 1, \ldots, n$  (3.5)

where $s_{R,i}$ is the square root of either (3.3) or (3.4) depending on whether
one assumes an asymmetric or symmetric error distribution, respectively.
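
The standardization in (3.3), (3.4), and (3.5) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, and the scale estimates (the sigma, tau, and delta quantities) are assumed to have been computed elsewhere.

```python
def correction_factor(tau, delta, sigma):
    """K = (tau/sigma)^2 * (2*delta/tau - 1), the common form of K_1 and K_2 in (3.2)."""
    return (tau / sigma) ** 2 * (2.0 * delta / tau - 1.0)

def r_residual_variance(sigma2, k2, h_ic, n, k1=None):
    """Estimated variance of the ith R-residual.

    With k1 supplied this is (3.3) (asymmetric errors);
    with k1 omitted it is (3.4) (symmetric errors), using h_i = 1/n + h_ic.
    """
    if k1 is None:
        return sigma2 * (1.0 - k2 * (1.0 / n + h_ic))
    return sigma2 * (1.0 - k1 / n - k2 * h_ic)

def internal_r_studentized(resid, sigma2, k2, h_ic, n, k1=None):
    """Internal R-studentized residual: the R-residual divided by s_{R,i}."""
    return resid / r_residual_variance(sigma2, k2, h_ic, n, k1) ** 0.5

# Hypothetical inputs: with K_1 = K_2 = 1 both forms reduce to sigma^2 * (1 - h_i).
v_sym = r_residual_variance(4.0, 1.0, 0.06, 25)
v_asym = r_residual_variance(4.0, 1.0, 0.06, 25, k1=1.0)
```

Setting both correction factors to 1 recovers the least squares variance estimate, which matches the remark that $K_1$ and $K_2$ act as corrections for the rank based fit.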

As with their least squares counterparts, we think the chief benefit of
the internal R-studentized residuals is their usefulness in diagnostic plots,
such as plots of residuals versus fitted values and q-q plots. These
residuals are corrected for both the design and the underlying variance.

It is interesting to compare expression (3.4) with the estimate of the
variance of the least squares residual, $\hat\sigma^2(1 - h_i)$. The correction factor $K_2$
depends on the score function $\varphi(\cdot)$ and the underlying symmetric error
distribution. If, for example, the error distribution is normal and if we use
normal scores, then $K_2$ converges almost surely to 1. In general, however, we
will not wish to specify the error distribution and then $K_2$ provides a natural
adjustment.

3.2. R-estimates for the mean shift outlier model.

The R diagnostics that follow depend on the mean shift outlier
model which is discussed in detail in Appendix A. Briefly, for the ith case,
the mean shift outlier model is

$Y = Xb + d_i\theta_i + e$  (3.6)

where $d_i$ is an $n \times 1$ vector of zeroes except for its ith component which is 1.
A formal test that the ith point is an outlier involves testing the
hypotheses $H_0: \theta_i = 0$ versus $H_A: \theta_i \neq 0$.

Below we obtain an R-estimate of $\theta_i$ and an estimate $\hat\tau(i)$ of $\tau$,
based on the model (3.6). These estimates will play a key role for the
R-diagnostics that follow.

One way of obtaining an R-estimate of $\theta_i$ involves fitting this model.
This would be computationally expensive since n such models need to be fit.
Another way would be to consider aligned rank procedures. These procedures
remove the effects of nuisance parameters (in this case $b$) by considering the
residuals from the reduced model (in this case $\hat e_R$ from the reduced model $Y =
Xb + e$); see Puri and Sen (1985) for a discussion of aligned rank procedures.

It is convenient to use the second form of the mean shift outlier model
(A.3) given by $Y = Xb^* + d_i^*\theta_i + e$, where $d_i^* = (I-H)d_i$, and $H$ is $X(X'X)^{-1}X'$.
In this form $X$ and $d_i^*$ are orthogonal and McKean (1975) has shown that this
helps eliminate bias in the estimates. This is the model Cook and Weisberg
(1982) used in obtaining the least squares external t diagnostic. Note that
the first part of the model, $Xb^*$, is a vector in the column space of $X$. Hence
the R-residuals from the fit of this reduced model are still $\hat e_R$.


Our R-estimate of $\theta_i$ is $\hat\theta_{R,i}$, which is a solution of $A_i(\theta_i) = 0$ where

$A_i(\theta_i) = \sum_{j=1}^{n} d_{ij}^* \, a(R(\hat e_{R,j} - \theta_i d_{ij}^*)).$  (3.7)


Thus the problem has been reduced to finding n simple regressions.
Furthermore these regressions are easily obtained. If we view the LHS
of (3.7) as a function of $\theta_i$, it can be shown that it is a decreasing step
function of $\theta_i$. The solution follows quickly using a simple linear search
routine. A procedure which works quite well is the Illinois version of
regula falsi similar to the algorithm discussed by McKean and Ryan (1977).
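
The search described above can be sketched as follows, assuming Wilcoxon scores. This is a generic Illinois-type regula falsi written for this illustration and not the McKean and Ryan (1977) routine; the bracketing strategy, tolerances, and toy inputs are assumptions. Because the function is a step function, the routine returns a point at which it changes sign rather than an exact zero.

```python
import math

def wilcoxon_score(u):
    return math.sqrt(12.0) * (u - 0.5)

def ranks(values):
    """Ranks 1..n; ties are broken by position (an assumption for simplicity)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        out[k] = rank
    return out

def a_i(theta, resid, d_star, score=wilcoxon_score):
    """A_i(theta) = sum_j d*_ij * a(R(e_Rj - theta * d*_ij)), as in (3.7)."""
    n = len(resid)
    u = [e - theta * d for e, d in zip(resid, d_star)]
    r = ranks(u)
    return sum(d * score(r[j] / (n + 1)) for j, d in enumerate(d_star))

def solve_theta(resid, d_star, tol=1e-8, max_iter=200):
    """Locate the sign change of the nonincreasing step function A_i."""
    f = lambda t: a_i(t, resid, d_star)
    lo, hi = -1.0, 1.0
    for _ in range(60):                      # expand until the sign change is bracketed
        if f(lo) >= 0.0 >= f(hi):
            break
        lo, hi = 2.0 * lo, 2.0 * hi
    else:
        raise ValueError("could not bracket a sign change of A_i")
    f_lo, f_hi, side = f(lo), f(hi), 0
    for _ in range(max_iter):
        if hi - lo <= tol or f_lo == f_hi:
            break
        mid = (lo * f_hi - hi * f_lo) / (f_hi - f_lo)   # secant (regula falsi) point
        f_mid = f(mid)
        if f_mid > 0.0:                      # sign change lies to the right of mid
            lo, f_lo = mid, f_mid
            if side == 1:
                f_hi *= 0.5                  # Illinois step: hi was retained twice
            side = 1
        elif f_mid < 0.0:                    # sign change lies to the left of mid
            hi, f_hi = mid, f_mid
            if side == -1:
                f_lo *= 0.5                  # Illinois step: lo was retained twice
            side = -1
        else:
            return mid
    return 0.5 * (lo + hi)

# Hypothetical toy input: case 1 has a large residual and, for simplicity,
# d* is taken to be the unit vector (as if the case had zero leverage).
resid = [10.0, -2.0, -1.0, 0.0, 1.0, 2.0]
d_star = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
theta_hat = solve_theta(resid, d_star)
```

In this toy case the estimate is the amount by which the first residual must be shifted to reach the median of the remaining residuals.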

The R-residuals from the fit of the second form of the mean shift
outlier model (A.3) are

$\tilde e_R = \hat e_R - \hat\theta_{R,i} d_i^*.$  (3.8)

Define $\hat\tau(i)$ and $\hat\tau^*(i)$ as the estimates of $\tau$ and $\tau^*$ based on the residual
vector $\tilde e_R$.


Note that if we replace the above rank criterion by the least squares
criterion then we obtain the least squares estimate of $\theta_i$ by using a series
of simple regressions to find a multiple regression; see Draper and Smith
(1981, p.204).


3.3.  RDFFIT. 

Next we consider a statistic that measures the first order
change in the R-fit of the ith case when the ith case is deleted. As in
Appendix B, the first order term in the change in the R-fit of the ith case
when the ith case is deleted is

$\mathrm{RDFFIT}_i = \hat Y_{R,i} - \hat Y_R(i).$  (3.9)
Equation (3.9) can be developed as follows: For the ith case, consider the
second formulation of the mean shift outlier model given by (A.3).
Appealing to the asymptotic orthogonality, $\hat Y_{R,i}$, the R-fit of $Y_i$ in the
original model (A.1), is the R-fit of the first term on the RHS of model
(A.3) and $\hat\theta_{R,i} d_{ii}^*$ is the R-fit of the second term on the RHS of model (A.3).
Hence the R-predicted value of $Y_i$ in the mean shift outlier model is, to the
first order, $\hat Y_{R,i} + \hat\theta_{R,i} d_{ii}^*$, which can be expressed as

$\hat Y_{R,i} + \hat\theta_{R,i} d_{ii}^* = [\hat Y_{R,i} - \hat\theta_{R,i} (Hd_i)_i] + \hat\theta_{R,i}.$

The term in brackets is, of course, the R-fit of the first term on the
RHS of the first formulation of the mean shift outlier model, namely $(Xb)_i$ of
model (A.2). As noted in Appendix A, when least squares methods are used, the
least squares fit of this term is $\hat Y_{LS}(i)$. Similar to least squares, the
bracketed term is

$\hat Y_R(i) \doteq \hat Y_{R,i} - \hat\theta_{R,i} h_i,$  (3.10)

so that $\mathrm{RDFFIT}_i \doteq h_i \hat\theta_{R,i}$.


Clearly, in order to be useful, $\mathrm{RDFFIT}_i$ needs to be assessed relative to
some scale. The following R-diagnostics are formulations of $\mathrm{RDFFIT}_i$ based on
appropriate scales.


3.4. RDCOOK and RDFFITS.


RDFFIT is a change in the fitted value; hence, a natural scale for
assessing RDFFIT is a fitted value scale. It follows from Appendix B, see
(B.5) and (B.6), that for the R-fit, assuming an asymmetric error
distribution,

$\mathrm{Var}(\hat Y_{R,i}) \doteq \tfrac{1}{n}\tau^{*2} + h_{ic}\tau^2.$

Hence, based on a fitted scale assessment, we standardize RDFFIT by an
estimate of the square root of this variance.
As noted in Appendix A, for least squares diagnostics there is some
discussion on whether to use the original model or the mean shift outlier
model for the estimation of scale. Cook and Weisberg (1982) advocate the
original model. In this case the scale estimate is the same for all n cases.
This allows casewise comparisons involving the diagnostic. Belsley, Kuh, and
Welsch (1980), however, advocate scale estimation based on the mean shift
outlier model. Note that both standardizations correct for the model and the
underlying variation of the errors.

Let $\hat\tau^*$ and $\hat\tau$ denote the estimators of $\tau^*$ and $\tau$ discussed in Section 2.
Our diagnostic in which $\mathrm{RDFFIT}_i$ is assessed relative to a fitted value scale
with estimates of scale based on the original model is given by

$(p\,\mathrm{RDCOOK}_i)^{1/2} = \frac{\mathrm{RDFFIT}_i}{(\tfrac{1}{n}\hat\tau^{*2} + h_{ic}\hat\tau^2)^{1/2}}.$

This is an R-analogue of the $(p\,\mathrm{DCOOK}_i)^{1/2}$ statistic proposed by Cook and Weisberg
(1982), see (A.9).

Let $\hat\tau^*(i)$ and $\hat\tau(i)$ denote the estimates of $\tau^*$ and $\tau$ for the mean shift
outlier model as discussed above. Then our diagnostic in which $\mathrm{RDFFIT}_i$ is
assessed relative to a fitted value scale with estimates of scale based on the
mean shift outlier model is given by

$\mathrm{RDFFITS}_i = \frac{\mathrm{RDFFIT}_i}{(\tfrac{1}{n}\hat\tau^{*2}(i) + h_{ic}\hat\tau^2(i))^{1/2}}.$  (3.11)

This is an R-analogue of the least squares diagnostic $\mathrm{DFFITS}_i$ proposed by
Belsley et al. (1980); see (A.10) of Appendix A.

If the error distribution is assumed to be symmetric, the R-diagnostics
are obtained by replacing $\mathrm{Var}(\hat Y_{R,i})$ with

$\mathrm{Var}(\hat Y_{R,i}) \doteq h_i \tau^2;$

see (B.6) of Appendix B. This eliminates the need to estimate $\tau^*$.
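
The two standardizations of this section differ only in which scale estimates are supplied, which the following sketch tries to make explicit. The function names and inputs are hypothetical, and RDFFIT is taken in its first-order form, the leverage times the R-estimate of the mean shift, from Section 3.3.

```python
import math

def fitted_value_scale(n, h_ic, tau, tau_star=None):
    """Square root of the estimated variance of the ith R-fitted value.

    Asymmetric errors: tau_star^2 / n + h_ic * tau^2.
    Symmetric errors (tau_star omitted): h_i * tau^2 with h_i = 1/n + h_ic.
    """
    if tau_star is None:
        return math.sqrt((1.0 / n + h_ic) * tau ** 2)
    return math.sqrt(tau_star ** 2 / n + h_ic * tau ** 2)

def rdffit(theta_r, n, h_ic):
    """First-order change in fit: h_i * theta_{R,i}."""
    return (1.0 / n + h_ic) * theta_r

def standardized_rdffit(theta_r, n, h_ic, tau, tau_star=None):
    """RDFFIT_i divided by the fitted value scale.

    Passing original-model estimates of tau and tau_star gives (p RDCOOK_i)^(1/2);
    passing the mean shift model estimates tau(i) and tau_star(i) gives RDFFITS_i.
    """
    return rdffit(theta_r, n, h_ic) / fitted_value_scale(n, h_ic, tau, tau_star)

# Hypothetical inputs: n = 25, h_ic = 0.06 (so h_i = 0.1), tau = 2, theta = 5.
value = standardized_rdffit(5.0, 25, 0.06, 2.0)
```

When the two scale parameters coincide, the asymmetric and symmetric forms agree, since $h_i = n^{-1} + h_{ic}$.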
There is disagreement on what cutoff values to use for flagging
points of potential influence. As Belsley et al. (1980) discuss in some
detail, DFFITS is inversely influenced by sample size. They advocate a
size-adjusted cutoff value of $2\sqrt{p/n}$ for DFFITS, which would lead to a $2/\sqrt{n}$
cutoff value for $(\mathrm{DCOOK})^{1/2}$. Cook and Weisberg (1982, p.118) suggest a more
conservative cutoff value of 1. In the examples in Section 4, we will use the
more liberal value, realizing these diagnostics are only flagging potential
influential points that require investigation. As with the two references
cited above, we would never recommend indiscriminate deletion of observations
solely because their diagnostic values exceed the cutoff point. Rather these
are potential points of influence which should be investigated.
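
The size-adjusted cutoffs are simple to evaluate. The sketch below reproduces the values quoted for the examples of Section 4, assuming p = 4 in both examples (three explanatory variables plus an intercept, which is an inference from the data descriptions rather than a stated value); the function names are hypothetical.

```python
import math

def dffits_cutoff(p, n):
    """Size-adjusted cutoff 2 * sqrt(p / n) for DFFITS-type diagnostics."""
    return 2.0 * math.sqrt(p / n)

def root_dcook_cutoff(n):
    """Corresponding cutoff 2 / sqrt(n) for (DCOOK)^(1/2)-type diagnostics."""
    return 2.0 / math.sqrt(n)

# Example 1: n = 41; Example 2 (stack loss): n = 21; p = 4 assumed in both.
example_1 = (round(dffits_cutoff(4, 41), 2), round(root_dcook_cutoff(41), 2))
example_2 = (round(dffits_cutoff(4, 21), 2), round(root_dcook_cutoff(21), 2))
```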


3.5. External $t_R$-statistic.

The above diagnostics, RDCOOK and RDFFITS, assess the first order
change in RDFFIT relative to the R-fitted scale which is a $\tau$-scale (or a $\tau^*$
and $\tau$ scale under the assumption of an asymmetric error distribution). This
change in fit, however, is proportional to $\hat\theta_{R,i}$. Hence assessing $\hat\theta_{R,i}$ on the
$\tau$-scale is consistent with the scale suggested by the approximate distribution
of an R-estimate, see (2.6).

Note that in the mean shift outlier model the leverage value of the ith
case is 1. As Huber (1981) showed, a necessary and sufficient condition for
the least squares estimates to be asymptotically normal is that the leverage
values go to zero uniformly. Similarly, this is a sufficient condition for
the asymptotic distribution theory of the R-estimates. Therefore the
asymptotic theory for neither least squares nor R-estimates holds for the mean
shift outlier model. Nevertheless the external t-statistic, $t_{LS}(i)$ ($\hat\theta_{LS,i}$
relative to its standard error), see (A.8), has proved to be an effective
diagnostic for least squares fit.

In analogy to the external $t_{LS}$-statistic, we propose the external
$t_R$-statistic, which assesses $\hat\theta_{R,i}$ relative to the $\tau$-scale.
Although this is the standardization suggested by the asymptotic distribution
theory, in light of the above discussion, we do not propose it as a test for
$H_0: \theta_i = 0$ versus $H_A: \theta_i \neq 0$. Instead we propose it as an alternative to the least
squares diagnostic $t_{LS}(i)$. We are still assessing the change RDFFIT on a
$\tau$-scale. We further feel that $\hat\tau$ is a more robust estimate of $\tau$ than $\hat\sigma$ is of
$\sigma$ and we have found in practice that it appears to be better at flagging
potential points of influence than $t_{LS}(i)$.

3.6. RDFBETAS.

When the diagnostics RDFFITS or RDCOOK are large for, say, the ith case,
then we usually want to investigate the impact this case has on the
individual regression coefficients. Thus, we want to consider the
statistic we shall define as

$\mathrm{RDFBETA}_i = \hat b_R - \hat b_R(i)$

where $\hat b_R(i)$ is the R-estimate of $b$ in the mean shift model (A.2).

In order to obtain this statistic, first note that if $\hat Y_R$ is the
R-prediction of $Y$ in the original model, then the R-estimate of $b$ is the
solution $\hat b_R$ to the equation

$X\hat b_R = \hat Y_R;$

that is,

$\hat b_R = (X'X)^{-1}X'\hat Y_R.$  (3.13)

In fact, most modern software obtains $\hat b_R$ by first finding $\hat Y_R$ employing a
convenient basis matrix of $X$; see, for example, Hettmansperger and McKean
(1983, Section 4).


Let $\tilde X = [X | d_i]$ denote the design matrix for the mean shift outlier
model. Let $\hat Y_{R,i}$ denote the R-fitted value of this model. Then according to
(3.13) $\hat b_R(i)$ is the first p coordinates of the vector

$(\tilde X'\tilde X)^{-1}\tilde X'\hat Y_{R,i}.$

From Section 3.2, (3.10), $\hat Y_{R,i} = \hat Y_R + \hat\theta_{R,i} d_i^*$. Then using the result for the
inverse of a partitioned matrix (see p.27 of Searle (1971)) and the fact that
$d_i^* = (I-H)d_i$, we obtain after some algebra that

$\hat b_R(i) = \hat b_R - (X'X)^{-1} x_i \hat\theta_{R,i}$

where $x_i'$ is the ith row of $X$. Hence

$\mathrm{RDFBETA}_i = (X'X)^{-1} x_i \hat\theta_{R,i}.$

To be useful $\mathrm{RDFBETA}_i$ needs to be measured relative to a scale. Since
it is proportional to a difference in fitted values we shall choose a
$\tau$-scale. As in Section 3.4, $\tau$ is estimated by using the mean shift
outlier model. Then the diagnostic, defined for the jth component of
$\mathrm{RDFBETA}_i$, is

$\mathrm{RDFBETAS}_{i,j} = \mathrm{RDFBETA}_{i,j} / (\hat\tau(i) \sqrt{(X_c'X_c)^{-1}_{jj}}).$

Belsley et al. (1980) advocate a size adjusted cutoff value of $2/\sqrt{n}$ for the
corresponding least squares diagnostic.

These diagnostics are straightforward to compute. Consider the diagonal
matrix $G = \mathrm{diag}\{\hat\theta_{R,1}, \ldots, \hat\theta_{R,n}\}$. Define the $(p-1) \times n$ matrix

$\mathrm{RDFBETA} = [\hat\beta_R - \hat\beta_R(1) \;\cdots\; \hat\beta_R - \hat\beta_R(n)].$

It then follows

$\mathrm{RDFBETA} = (X_c'X_c)^{-1}X_c'G.$

Note that each of the n columns of RDFBETA is simply a least squares fit of a
column of $G$. They can be obtained quickly using the QR-subroutines in
LINPACK; see Dongarra et al. (1979).
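
The column-by-column computation can be sketched as below. Note the substitution: for a self-contained illustration this sketch solves the normal equations by Gaussian elimination instead of using the LINPACK QR routines mentioned above, which is adequate only for small, well-conditioned problems. All function names and the toy inputs are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def gram(xc):
    """Xc' Xc for a centered design given as a list of rows."""
    q = len(xc[0])
    return [[sum(row[a] * row[b] for row in xc) for b in range(q)] for a in range(q)]

def rdfbeta(xc, theta):
    """Columns (Xc'Xc)^{-1} x_ic * theta_i, i.e. the columns of (Xc'Xc)^{-1} Xc' G."""
    g = gram(xc)
    return [solve(g, [x * t for x in row]) for row, t in zip(xc, theta)]

def rdfbetas(xc, theta, tau_i):
    """Scale component j of column i by tau(i) * sqrt of the jth diagonal of (Xc'Xc)^{-1}."""
    g = gram(xc)
    q = len(g)
    inv_diag = [solve(g, [1.0 if k == j else 0.0 for k in range(q)])[j] for j in range(q)]
    cols = rdfbeta(xc, theta)
    return [[c[j] / (t * math.sqrt(inv_diag[j])) for j in range(q)]
            for c, t in zip(cols, tau_i)]

# Hypothetical toy inputs: two centered, orthogonal predictors and three cases.
xc = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
theta = [2.0, 0.0, -4.0]
cols = rdfbeta(xc, theta)
scaled = rdfbetas(xc, theta, [1.0, 1.0, 1.0])
```

A case whose estimated mean shift is zero contributes a zero column, so it has no first-order effect on any coefficient.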


4.  Examples 

The following two examples illustrate the power of the R-diagnostics
in detecting influential points in linear models. The R-estimates were
computed by the algorithm discussed in Hettmansperger and McKean (1983). To
compute the diagnostics we used the LINPACK subroutines SQRDC and SQRSL for
the numerical linear algebra parts, such as leverage values, projections, etc.
The R-estimate of $\theta_i$ was computed as discussed in Section 3. The parameter $\tau$
was estimated as discussed in Koul et al. (1987) using the value of a = .80.

Example 1. The data for this example can be found in Morrison (1983, p.64).

The response is the level of free fatty acid of prepubescent boys while the
independent variables are age, weight, and skin fold thickness. The sample
size is 41. Figure 1 depicts the residual plot based on the least squares
fit. From this plot there appear to be several outliers. Certainly the
points 12, 22, 26 and 9 are outlying and perhaps the points 8, 10 and 38. In
fact, the first four of these points probably spoiled the least squares fit,
obscuring the points 8, 10 and 38. This seems apparent from the residual
plot based on the Wilcoxon fit, Figure 2, where all seven points stand out.
Table 1 gives the values of the internal t, external t, DFFITS and
$(\mathrm{DCOOK})^{1/2}$ diagnostic statistics for both the least squares and Wilcoxon fit.

Using a cutoff value of 2 for the external t statistics and the suggested
cutoff values of .62 for DFFITS and .31 for $(\mathrm{DCOOK})^{1/2}$, the least squares
diagnostics flag only points 12 and 22 while the R-diagnostics flag all seven
points. Both R-diagnostics are necessary; for instance RDFFIT and $(\mathrm{RDCOOK})^{1/2}$
flag point 8 while the external $t_R$ is at 1.84. Conversely the external $t_R$
flags point 26 while the other two do not.

Table 2 displays the RDFBETAS. Using the suggested cutoff value
of .31, these statistics indicate an influential effect on at least one $\beta$ for
five of the above points and on the two exceptions, points 26 and 22, the
outcome is borderline. Note that RDFFIT is large for $\beta$ at point 11.
Although this point was not flagged above it is a point of high leverage;
i.e. $h_i > 2p/n$.


Note  that  in  both  residual  plots,  the  low  values  of  the  residuals  are 
bunched  together  while  the  higher  values  are  more  dispersed;  i.e.,  the 
distribution  of  the  residuals  appears  to  be  positively  skewed.  For  a  final 
fit,  then,  we  proceeded  to  use  the  bent  R-score  function  given  by 


<p(u) 


|  <  u  <  1 
0  <  u  <  | 


which  is  suited  for  positively  skewed  error  distributions  with  heavy  right 
tails; see McKean and Sievers (1988) for a discussion of these scores. In its
residual  plot,  Figure  3,  the  outliers  stand  out  more  than  in  the  previous  fits 
and  it  does  appear  to  be  more  scattered  indicating  a  better  fit.  The 
regression  estimates  for  all  three  fits  appear  in  Table  3.  They  do  differ, 
especially  the  estimates  of  p^ .  Table  4  displays  the  diagnostics  for  the  bent 

score fit. Note that the above seven points are flagged as well as point 11.


Example 2. The data set of this example is the stack-loss data presented in
Daniel and Wood (1971, p.60). It has been discussed in several articles on
robust methods, for instance, Andrews (1974) and Hettmansperger and McKean
(1977). In the latter article, robust residual plots are presented for fits
using various R-scores. It appears from these plots that observations 1, 3,
4, and 21 are outlying points.

In Table 5 we present the diagnostic measures for both an R and a least
squares fit (the R-fit used Wilcoxon scores). The R-diagnostics clearly
indicate that these points need further investigation. RDFFIT exceeds
$2\sqrt{p/n} = .87$ on all 4 of these points, the external t exceeds 2.0 on all
but the first point (but even here it is at 1.91), and $(\mathrm{RDCOOK})^{1/2}$ exceeds
$2/\sqrt{n} = .44$ on points 1 and 21. From the RDFBETA values, points 1 and 3 had an
impact on one coefficient while the remaining two points had an impact on both
$\beta_1$ and a second coefficient.

None of the R-diagnostics for the remaining 17 points exceeded the cutoff
values.

In  contrast,  for  least  squares,  only  observation  21  was  flagged  by  DFFITS 

while  observations  4  and  21  were  flagged  by  the  least  squares  external  t 
statistic. The remaining two were not flagged.

5.  Conclusions 

Diagnostics  are  an  extremely  important  part  of  many  data  analyses. 

Least  squares  diagnostics  have  been  effective  in  detecting  and  identifying 
aberrant  cases.  These  methods  fit  most  naturally  with  least  squares  based 
inference.  Currently,  there  are  several  approaches  to  robust  inference  in  the 
linear  model.  The  present  paper  suggests  natural  diagnostic  quantities  to  be 
used  in  conjunction  with  robust  rank-based  inference.  The  robust  diagnostics 
appear  to  have  some  advantages.  In  the  examples,  they  were  able  to  flag 
cases  of  potential  trouble  that  were  passed  over  by  least  squares 
diagnostics. 

Appendix  A . 

In  this  appendix  we  derive  the  least  squares  diagnostic  tools  (internal 
and  external  t,  DFFITS,  DCOOK,  and  DFBETAS)  from  a  common  source  (the  mean 
shift  outlier  model).  We  also  establish  some  of  the  results  we  need  in  the 
derivation  of  the  R-diagnostics . 

Consider the linear model,

$Y = Xb + e$  (A.1)

which is defined in Section 2.  The mean shift outlier model for the ith data
point is defined by

$Y = Xb + d_i\theta_i + e$  (A.2)

where $d_i$ is an $n \times 1$ vector of zeroes except that its ith component is 1.

The parameters $\theta_i$, $i = 1,\ldots,n$, play a key role in the diagnostics.

There are several ways of writing model (A.2).  Following Cook and
Weisberg (1982) and letting $d_i^* = (I-H)d_i$, where $H$ is $X(X'X)^{-1}X'$, the model can
be written as

$Y = Xb^* + d_i^*\theta_i + e.$  (A.3)

Since $X$ and $d_i^*$ are orthogonal, the least squares estimate of $\theta_i$ is

$\hat\theta_{LS,i} = \frac{d_i^{*\prime}Y}{d_i^{*\prime}d_i^*} = \frac{\hat e_{LS,i}}{1-h_{ii}}.$  (A.4)

The second equality holds since $d_i^{*\prime}d_i^* = 1-h_{ii}$ and $d_i^{*\prime}Y = d_i'(I-H)Y = \hat e_{LS,i}$.


Next we want to connect $\hat\theta_{LS,i}$ with the statistic $\mathrm{DFFIT}_i$, which is the
difference between the fitted value of $Y_i$ at the model (A.1) and its fitted value when the ith
point is deleted.  Let $\hat Y_{LS,i}$ be the fitted value of $Y_i$ at model (A.1) and let
$\hat Y_{LS}(i)$ be the fitted value of $Y_i$ when the ith point is deleted.  Then

$\mathrm{DFFIT}_i = \hat Y_{LS,i} - \hat Y_{LS}(i).$

In order to obtain $\hat Y_{LS}(i)$, we need not delete the point and refit since it
follows from Cook and Weisberg (1982, p.33) that

$\hat Y_{LS}(i) = Y_i - \hat\theta_{LS,i};$

hence,

$\mathrm{DFFIT}_i = \hat Y_{LS,i} - (Y_i - \hat\theta_{LS,i}) = -(1-h_{ii})\hat\theta_{LS,i} + \hat\theta_{LS,i} = h_{ii}\hat\theta_{LS,i},$  (A.5)

where the middle equality follows from (A.4).
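
The identities (A.4) and (A.5) can be checked numerically. The sketch below, on simulated data (the design, coefficients and seed are arbitrary illustrative choices, not from the paper), fits the mean shift model (A.2) directly and also refits with case i deleted, then compares both with the closed forms.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 12, 3  # p counts the columns of X, intercept included (assumption)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
h = np.diag(H)
fitted = H @ Y
resid = Y - fitted
theta_hat = resid / (1.0 - h)           # closed form (A.4)

i = 4                                   # case under study
d = np.zeros(n)
d[i] = 1.0
# Mean shift model (A.2): regress Y on [X, d_i]; last coefficient is theta_i.
coef_shift, *_ = np.linalg.lstsq(np.column_stack([X, d]), Y, rcond=None)
theta_from_fit = coef_shift[-1]

# Brute-force deletion: refit without case i and predict at x_i.
keep = np.arange(n) != i
b_del, *_ = np.linalg.lstsq(X[keep], Y[keep], rcond=None)
pred_del = X[i] @ b_del
dffit_refit = fitted[i] - pred_del
dffit_formula = h[i] * theta_hat[i]     # closed form (A.5)

print(theta_from_fit, theta_hat[i])
print(dffit_refit, dffit_formula)
```

The two printed pairs agree to rounding error, which is the point of the derivation: no refitting is needed to obtain the deleted fitted value.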

The least squares diagnostics follow from different standardizations of
$\mathrm{DFFIT}_i$.  For the t-statistics note from (A.4) that

$\mathrm{Var}(\hat\theta_{LS,i}) = \sigma^2/(1-h_{ii}).$

If we standardize $\mathrm{DFFIT}_i$ by using the estimate $\hat\sigma^2$ of $\sigma^2$ based on model
(A.1) and use (A.4) and (A.5), we then get the internal t statistic given by

$\frac{\mathrm{DFFIT}_i}{h_{ii}(\hat\sigma/\sqrt{1-h_{ii}})} = \frac{\hat\theta_{LS,i}}{\hat\sigma/\sqrt{1-h_{ii}}} = \frac{\hat e_{LS,i}}{\hat\sigma\sqrt{1-h_{ii}}}.$  (A.6)

This is called the internally studentized residual; see Cook and Weisberg
(1982, p.18).

Next suppose we standardize $\mathrm{DFFIT}_i$ by the estimate $s^2(i)$ of $\sigma^2$ based on
the mean shift outlier model (A.2).  As derived in Cook and Weisberg (1982,
p.20),

$s^2(i) = \frac{(n-p)\hat\sigma^2 - \hat e_{LS,i}^2/(1-h_{ii})}{n-p-1}.$  (A.7)

The corresponding standardization of $\mathrm{DFFIT}_i$ is

$\frac{\mathrm{DFFIT}_i}{h_{ii}(s(i)/\sqrt{1-h_{ii}})} = \frac{\hat\theta_{LS,i}}{s(i)/\sqrt{1-h_{ii}}} = \frac{\hat e_{LS,i}}{s(i)\sqrt{1-h_{ii}}} = t_{LS}(i).$  (A.8)

This is the externally studentized residual; see Cook and Weisberg
(1982, p.20).  This is also the t-statistic for testing $H_0: \theta_i = 0$ versus
$\theta_i \neq 0$ in model (A.2).
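
The claim that the externally studentized residual is the t-statistic for $\theta_i$ in the mean shift model can be illustrated as follows. This is a sketch on simulated data, assuming p counts all columns of X (intercept included) so that the residual degrees of freedom are n - p; the data and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 15, 3  # p = number of columns of X (assumption)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)
e = Y - H @ Y
sigma2_hat = e @ e / (n - p)

# Internal t: residual scaled by the full-sample variance estimate.
t_int = e / np.sqrt(sigma2_hat * (1 - h))
# Deleted variance estimate and external t.
s2_del = ((n - p) * sigma2_hat - e**2 / (1 - h)) / (n - p - 1)
t_ext = e / np.sqrt(s2_del * (1 - h))

# t-statistic for theta_i obtained by fitting the mean shift model directly.
i = 7
d = np.zeros(n)
d[i] = 1.0
Z = np.column_stack([X, d])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
r = Y - Z @ coef
s2 = r @ r / (n - p - 1)
se_theta = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[-1, -1])
t_theta = coef[-1] / se_theta

print(t_theta, t_ext[i])
```

The direct fit and the closed form give the same value, so all n external t statistics come from a single least squares fit.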

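The above standardizations of $\mathrm{DFFIT}_i$ are consequences of considering it
in terms of $\hat\theta_{LS,i}$.  Suppose instead we standardize it in terms of fitted values.
Note that

$\mathrm{Var}(\hat Y_{LS,i}) = \sigma^2 h_{ii}.$

If we standardize $\mathrm{DFFIT}_i$ by using $\hat\sigma^2$ as our estimate of $\sigma^2$ we get

$\frac{\mathrm{DFFIT}_i}{\hat\sigma\sqrt{h_{ii}}} = \frac{\hat\theta_{LS,i}\sqrt{h_{ii}}}{\hat\sigma} = r_{LS,i}\sqrt{\frac{h_{ii}}{1-h_{ii}}} = \sqrt{p}\,\mathrm{DCOOK}_i,$

where $r_{LS,i}$ is the internally studentized residual of (A.6).
These equalities follow from (A.4) and (A.6).  See Cook and Weisberg
(1982, p.117).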
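
The link to Cook's distance can be verified by computing the distance from its deletion definition. The sketch below assumes that DCOOK denotes the signed square root of Cook's distance (an inference from the way the tabulated values track DFFITS divided by the square root of p, not something this passage states), and that p counts the columns of X.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 15, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages h_ii
e = Y - X @ b
sigma2_hat = e @ e / (n - p)

# DFFIT_i = h_ii * theta_hat_i, standardized by sigma_hat * sqrt(h_ii).
dffit = h * e / (1 - h)
standardized = dffit / np.sqrt(sigma2_hat * h)

# Cook's distance from its deletion definition.
cook = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    b_i, *_ = np.linalg.lstsq(X[keep], Y[keep], rcond=None)
    diff = b - b_i
    cook[i] = diff @ (X.T @ X) @ diff / (p * sigma2_hat)

signed_root_cook = np.sign(e) * np.sqrt(cook)
print(np.max(np.abs(standardized - np.sqrt(p) * signed_root_cook)))
```

The printed maximum discrepancy is at rounding-error level.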

If, on the other hand, we standardize it by using $s^2(i)$ as our estimate
of $\sigma^2$ we get


$\mathrm{DFBETA} = (X'X)^{-1}X'G.$
These can be obtained quickly using the QR-subroutines in LINPACK.

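As with $\mathrm{DFFIT}_i$, we can standardize DFBETA in several ways.  The one we
shall note here is to use $s(i)$ to estimate $\sigma$.  This leads to

$\mathrm{DFBETAS}_{LS,i,j} = \frac{b_j - b_j(i)}{s(i)\sqrt{(X'X)^{-1}_{jj}}} = \frac{(\mathrm{DFBETA})_{ij}}{s(i)\sqrt{(X'X)^{-1}_{jj}}},$

which is a diagnostic proposed by Belsley et al. (1980, p.13).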
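
A short numerical sketch of DFBETA and DFBETAS follows. It uses the standard closed form $b - b(i) = (X'X)^{-1}x_i\hat e_{LS,i}/(1-h_{ii})$ and checks it against brute-force deletion. The data are simulated, p counts the columns of X (assumption), and an explicit matrix inverse is used for brevity rather than the QR routines mentioned in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 15, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
e = Y - X @ b
sigma2_hat = e @ e / (n - p)
s_del = np.sqrt(((n - p) * sigma2_hat - e**2 / (1 - h)) / (n - p - 1))

# Row i holds b - b(i), computed without refitting.
dfbeta = (X @ XtX_inv) * (e / (1 - h))[:, None]
# Standardize column j by s(i) * sqrt of the jth diagonal of (X'X)^{-1}.
dfbetas = dfbeta / (s_del[:, None] * np.sqrt(np.diag(XtX_inv))[None, :])

# Brute-force check: refit with each case deleted in turn.
dfbeta_refit = np.empty((n, p))
for i in range(n):
    keep = np.arange(n) != i
    b_i, *_ = np.linalg.lstsq(X[keep], Y[keep], rcond=None)
    dfbeta_refit[i] = b - b_i

print(np.max(np.abs(dfbeta - dfbeta_refit)))
```

Each row of `dfbetas` gives the standardized change in every coefficient when that case is removed.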


Appendix B

In this appendix, we develop the approximations up to terms of order $n^{-1}$
for the variance-covariance matrices $\mathrm{Cov}(\hat e_R)$ and $\mathrm{Cov}(\hat Y_R)$.  We will
concentrate on the case of asymmetric error distributions and state the
results in the symmetric case.  We will use the notation $H_c = P_{X_c} = X_c(X_c'X_c)^{-1}X_c'$
and $J = P_1 = n^{-1}11'$, along with $h_{ic}$, the ith diagonal element
of $H_c$.  Then the leverage of the ith case is $h_i = n^{-1} + h_{ic}$.  The main
results are

$\mathrm{Cov}(\hat e_R) \doteq \sigma^2\{I - K_1 J - K_2 H_c\}$  (B.1)

where

$K_1 = (\tau^*/\sigma)^2(2\delta^*/\tau^* - 1),$  (B.2)

$K_2 = (\tau/\sigma)^2(2\delta/\tau - 1),$

$\delta^* = E(e_i\,\mathrm{sgn}\,e_i),$

$\delta = E[e_i\,a(F(e_i))],$

$\sigma^2$ is the error variance, and $\tau^*$, $\tau$ are defined in Section 2.
Hence,

$\mathrm{Var}\,\hat e_{R,i} = \sigma^2(1 - K_1 n^{-1} - K_2 h_{ic}).$  (B.3)

In the case of a symmetric error distribution,

$\mathrm{Var}\,\hat e_{R,i} = \sigma^2(1 - K_2 h_i).$  (B.4)

Recall, from Cook and Weisberg (1982, p.11), that in the least squares case
$\mathrm{Var}\,\hat e_{LS,i} = \sigma^2(1-h_i)$, so that $K_1$ and $K_2$ are correction factors due to using
the rank score fitting algorithm.
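
To give a feel for the size of the correction factor $K_2$, the sketch below evaluates it numerically for one specific case: Wilcoxon scores with standard normal errors. Everything specific here is an assumption made for illustration, since Section 2 is not reproduced in this passage: the score function is taken as $\sqrt{12}(u - 1/2)$, and $\tau$ is taken as the usual Wilcoxon scale parameter $[\sqrt{12}\int f^2]^{-1}$.

```python
import math
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

x = np.linspace(-10.0, 10.0, 40001)
f = np.exp(-x**2 / 2.0) / math.sqrt(2.0 * math.pi)        # N(0,1) density
F = 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))  # N(0,1) cdf

score = math.sqrt(12.0) * (F - 0.5)        # assumed Wilcoxon score at F(e)
sigma = 1.0
tau = 1.0 / (math.sqrt(12.0) * trapezoid(f**2, x))   # assumed definition of tau
delta = trapezoid(x * score * f, x)        # delta = E[e * score(F(e))]
K2 = (tau / sigma) ** 2 * (2.0 * delta / tau - 1.0)

print(f"tau={tau:.4f} delta={delta:.4f} K2={K2:.4f}")
```

Under these assumptions $K_2$ works out to $2 - \pi/3$, slightly below 1, so the approximate residual variance $\sigma^2(1 - K_2 h_i)$ is close to, but a little larger than, the least squares value $\sigma^2(1 - h_i)$.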

Likewise

$\mathrm{Cov}(\hat Y_R) \doteq \tau^{*2}J + \tau^2 H_c$

$\mathrm{Var}\,\hat Y_{R,i} = n^{-1}\tau^{*2} + h_{ic}\tau^2,$  (B.5)

while in the symmetric case,

$\mathrm{Var}\,\hat Y_{R,i} = h_i\tau^2.$  (B.6)

Before giving a sketch of the derivations of these formulas, we discuss the
estimation of the parameters appearing above.  Natural estimates of $\delta^*$ and $\delta$
are

$\hat\delta^* = \frac{1}{n-p}\sum_{i=1}^{n}|\hat e_{R,i}|$  (B.7)

$\hat\delta = \frac{1}{n-p}D(\hat\beta_R)$  (B.8)

where $D(\beta)$ is defined in (2.2).  Estimates of $\tau^*$ and $\tau$ are referenced in
Section 2.

We now outline an approximation for $\mathrm{Cov}(\hat e_R)$, the variance-covariance
matrix of $\hat e_R$, the vector of residuals.  Using (2.4) and (2.8),

$\hat e_R \doteq Y - 1(\alpha + \tau^* n^{-1}1'a^*) - X_c(\beta + \tau(X_c'X_c)^{-1}X_c'a)$

where $a = a(R(e))$ and $a^{*\prime} = (\mathrm{sgn}\,e_1,\ldots,\mathrm{sgn}\,e_n)$.

Then

$\hat e_R \doteq e - \tau^* Ja^* - \tau H_c a.$

Now $Ea = 0 = Ea^*$ and hence

$\mathrm{Cov}(\hat e_R) \doteq E\{ee' - 2\tau^* ea^{*\prime}J - 2\tau ea'H_c + \tau^{*2}Ja^*a^{*\prime}J + 2\tau^*\tau Ja^*a'H_c + \tau^2 H_c aa'H_c\}.$

Note that $Eaa' = I = Ea^*a^{*\prime}$.  Further $Eea^{*\prime} = \delta^* I$, $Eea' = \delta I$, and $Ea^*a' = cI$
for a constant $c$.  Now using $J'J = J$, $H_c'H_c = H_c$, $J'H_c = 0$ we have

$\mathrm{Cov}(\hat e_R) \doteq \sigma^2 I - \tau^{*2}(2\delta^*/\tau^* - 1)J - \tau^2(2\delta/\tau - 1)H_c.$
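
The last algebraic step can be checked mechanically. The sketch below substitutes the stated expectations into the six-term expansion and compares the result with $\sigma^2\{I - K_1J - K_2H_c\}$. The numerical values of the constants are arbitrary placeholders chosen only to exercise the algebra, and `k` (the number of centered regressors) is a name introduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 10, 2                       # k centered regressors (illustrative)
X_raw = rng.normal(size=(n, k))
Xc = X_raw - X_raw.mean(axis=0)    # centered design
J = np.ones((n, n)) / n            # projection onto the vector of ones
Hc = Xc @ np.linalg.solve(Xc.T @ Xc, Xc.T)
I = np.eye(n)

# Arbitrary placeholder constants, not estimates from any data set.
sigma2, tau_s, tau, delta_s, delta, c = 1.3, 1.1, 0.9, 0.8, 0.7, 0.5

# Six-term expansion with E[ee']=sigma2*I, E[ea*']=delta_s*I, E[ea']=delta*I,
# E[a*a*']=I, E[a*a']=c*I and E[aa']=I substituted in.
expanded = (sigma2 * I - 2 * tau_s * delta_s * J - 2 * tau * delta * Hc
            + tau_s**2 * (J @ J) + 2 * tau_s * tau * c * (J @ Hc)
            + tau**2 * (Hc @ Hc))

K1 = (tau_s**2 / sigma2) * (2 * delta_s / tau_s - 1)
K2 = (tau**2 / sigma2) * (2 * delta / tau - 1)
compact = sigma2 * (I - K1 * J - K2 * Hc)

# Leverage decomposition h_i = 1/n + h_ic for the design with an intercept.
X = np.column_stack([np.ones(n), X_raw])
h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))

print(np.max(np.abs(expanded - compact)))
```

The cross term vanishes because the centered columns are orthogonal to the vector of ones, which is why the constant c never appears in the final result.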

Then (B.1), (B.2) and (B.3) follow immediately.  The formula (B.5)
follows in a similar fashion from $\hat Y_R \doteq 1\alpha + X_c\beta + \tau^* Ja^* + \tau H_c a$.  Similarly for
formulas (B.4) and (B.6).

Table 1.  Diagnostics for R (Wilcoxon Scores)
and Least Squares fit of Example 1.

Case   Int(R)   Ext(R)  RDFFITS   RDCOOK  Int(LS)  Ext(LS)   DFFITS    DCOOK
   1     0.72     0.80     0.20     0.10     0.49     0.49     0.12     0.06
   2    -0.79    -1.00    -0.34    -0.17    -1.19    -1.20    -0.40    -0.20
   3    -0.16    -0.19    -0.05    -0.03    -0.53    -0.53    -0.15    -0.07
   4    -0.64    -0.84    -0.19    -0.10    -0.96    -0.95    -0.22    -0.11
   5     0.69     0.76     0.28     0.14     0.42     0.42     0.15     0.08
   6     0.12     0.13     0.03     0.02    -0.21    -0.21    -0.05    -0.02
   7    -0.72    -0.92    -0.32    -0.17    -1.07    -1.07    -0.37    -0.18
   8     1.51     1.84     0.67     0.32     1.16     1.17     0.43     0.21
   9     2.07     2.69     0.56     0.26     1.74     1.79     0.38     0.18
  10     1.54     1.94     0.62     0.29     1.12     1.13     0.36     0.18
  11     0.93     1.05     0.53     0.25     0.56     0.56     0.30     0.15
  12     3.18     4.13     1.03     0.47     2.84     3.17     0.79     0.35
  13    -0.56    -0.65    -0.27    -0.14    -0.77    -0.77    -0.35    -0.18
  14     0.00    -0.00    -0.00    -0.00    -0.34    -0.34    -0.09    -0.04
  15     0.21     0.28     0.08     0.04    -0.12    -0.12    -0.04    -0.02
  16     0.66     0.78     0.18     0.09     0.31     0.31     0.07     0.04
  17    -0.03    -0.11    -0.02    -0.01    -0.34    -0.34    -0.07    -0.03
  18    -0.72    -0.91    -0.26    -0.14    -1.12    -1.13    -0.32    -0.16
  19    -0.43    -0.56    -0.24    -0.13    -0.83    -0.82    -0.36    -0.18
  20    -0.66    -0.86    -0.30    -0.16    -1.00    -1.00    -0.35    -0.17
  21    -0.60    -0.80    -0.13    -0.07    -0.90    -0.89    -0.15    -0.07
  22     2.43     2.80     0.61     0.30     2.26     2.40     0.53     0.25
  23     0.16     0.19     0.03     0.02    -0.11    -0.10    -0.02    -0.01
  24    -0.72    -0.90    -0.22    -0.12    -1.09    -1.09    -0.27    -0.13
  25    -0.32    -0.39    -0.14    -0.07    -0.49    -0.49    -0.18    -0.09
  26     1.73     2.06     0.43     0.21     1.51     1.53     0.32     0.16
  27    -0.76    -0.96    -0.19    -0.10    -1.09    -1.10    -0.21    -0.11
  28     0.47     0.58     0.14     0.07     0.24     0.24     0.06     0.03
  29     1.00     1.27     0.39     0.19     0.76     0.75     0.23     0.12
  30    -0.51    -0.59    -0.18    -0.09    -0.71    -0.70    -0.22    -0.11
  31    -0.78    -0.97    -0.20    -0.10    -1.05    -1.05    -0.22    -0.11
  32    -0.19    -0.24    -0.07    -0.04    -0.36    -0.36    -0.11    -0.06
  33     0.84     1.02     0.24     0.12     0.57     0.57     0.14     0.07
  34    -0.58    -0.68    -0.15    -0.08    -0.81    -0.81    -0.18    -0.09
  35    -0.69    -0.86    -0.23    -0.12    -0.98    -0.97    -0.26    -0.13
  36    -0.02    -0.13    -0.13    -0.07     0.18     0.18     0.19     0.09
  37     0.67     0.76     0.20     0.10     0.49     0.49     0.13     0.06
  38     1.64     1.95     0.82     0.40     1.27     1.29     0.54     0.27
  39     0.89     1.01     0.41     0.21     0.73     0.73     0.30     0.15
  40    -0.30    -0.41    -0.21    -0.11    -0.51    -0.50    -0.29    -0.15
  41     0.07     0.10     0.04     0.02    -0.09    -0.08    -0.03    -0.02

Table 2.  RDFBETAS (Wilcoxon Scores) for Example 1.

Case    Incep.       Age    Weight  Skinfold
   1      0.03     -0.08     -0.02      0.11
   2     -0.09      0.20     -0.25      0.25
   3     -0.04      0.02      0.02     -0.00
   4     -0.13      0.07      0.06     -0.06
   5      0.16     -0.11     -0.09      0.17
   6      0.02     -0.02      0.00      0.00
   7     -0.16      0.27     -0.20      0.09
   8      0.43     -0.00     -0.49      0.21
   9      0.34     -0.14     -0.08     -0.17
  10      0.47     -0.17     -0.19     -0.15
  11      0.14     -0.44      0.49     -0.27
  12      0.83     -0.48     -0.15     -0.18
  13     -0.09     -0.01      0.20     -0.26
  14     -0.00      0.00      0.00     -0.00
  15      0.06     -0.02     -0.05      0.03
  16      0.06     -0.07      0.09     -0.13
  17     -0.01     -0.00      0.01     -0.00
  18     -0.09      0.09     -0.11      0.20
  19     -0.02      0.13     -0.21      0.19
  20     -0.10     -0.11      0.26     -0.09
  21     -0.04      0.00      0.02      0.00
  22     -0.16      0.04      0.05      0.29
  23      0.01     -0.01      0.01      0.01
  24     -0.04      0.07     -0.12      0.17
  25      0.01      0.06     -0.07     -0.05
  26     -0.06     -0.11      0.25     -0.06
  27     -0.02     -0.02      0.00      0.08
  28     -0.08      0.05      0.05     -0.04
  29     -0.24      0.25      0.02     -0.14
  30      0.04     -0.11      0.13     -0.12
  31      0.08     -0.12      0.02      0.03
  32      0.04     -0.01     -0.04      0.00
  33     -0.06      0.16     -0.11      0.00
  34      0.07     -0.10      0.05     -0.02
  35      0.04     -0.16      0.13     -0.01
  36      0.06     -0.01     -0.01     -0.10
  37     -0.14      0.09      0.05     -0.01
  38     -0.19      0.32      0.08     -0.57
  39     -0.34      0.28      0.07     -0.07
  40      0.12     -0.22      0.14     -0.05
  41     -0.03      0.02      0.01     -0.01

Table 3.  Fits for Example 1
(standard error in parentheses).

Fit              Incep.         Age             Weight          SkinFold        Scale $\hat\sigma$ or $\hat\tau$
Least Squares    1.70 (.327)    -.0021 (.003)   -.0152 (.005)   .2045 (.166)    .215
R-Wilcoxon       1.49 (.273)    -.0011 (.003)   -.0154 (.004)   .2739 (.137)    .178
R-Bent Score     1.43 (.247)    -.0009 (.002)   -.0152 (.004)   .3079 (.124)    .159


Table 4.  Diagnostics for R (Bent Scores)
for Example 1.

Case      Int(R)      Ext(R)     RDFFITS      RDCOOK
   1        0.67        1.07        0.27        0.14
   2       -0.77       -0.97       -0.33       -0.17
   3       -0.14       -0.13       -0.04       -0.02
   4 [illegible]       -0.77       -0.17       -0.09
   5        0.64        0.97        0.35        0.18
   6        0.11        0.32        0.07        0.04
   7       -0.72       -0.94       -0.32       -0.16
   8        1.53        2.16        0.79        0.38
   9        2.09        3.14        0.66        0.30
  10        1.58        2.57        0.82        0.35
  11        0.89        1.13        0.61        0.24
  12        3.20        5.20        1.29        0.53
  13       -0.61       -0.71       -0.33       -0.17
  14        0.01        0.18        0.05        0.02
  15        0.22        0.46        0.13        0.07
  16        0.67        1.07        0.25        0.13
  17       -0.03        0.13        0.03        0.01
  18       -0.68       -0.88       -0.25       -0.13
  19       -0.41       -0.52       -0.23       -0.12
  20       -0.63       -0.80       -0.28       -0.15
  21       -0.61       -0.73       -0.12       -0.06
  22        2.35        3.30        0.72        0.34
  23        0.13        0.34        0.06        0.03
  24       -0.71       -0.89       -0.22       -0.12
  25       -0.41       -0.52       -0.19       -0.09
  26        1.69        2.43        0.51        0.24
  27       -0.75 [illegible]       -0.18       -0.09
  28        0.43        0.80        0.19        0.09
  29        0.97        1.48        0.46        0.21
  30       -0.57       -0.64       -0.20       -0.11
  31       -0.80       -0.95       -0.20       -0.11
  32       -0.27       -0.34       -0.10       -0.05
  33        0.83        1.35        0.32        0.15
  34       -0.61       -0.71       -0.16       -0.09
  35       -0.69       -0.85       -0.22       -0.12
  36       -0.35       -0.39       -0.39       -0.19
  37        0.61        1.09        0.28        0.13
  38        1.67        2.46        1.04        0.46
  39        0.81        1.39        0.57        0.27
  40       -0.32       -0.36       -0.19       -0.09
  41        0.00        0.21        0.08        0.04

Table 5.  Diagnostics for R (Wilcoxon Scores)
and Least Squares fit of Example 2.

Case   Int(R)   Ext(R)  RDFFITS   RDCOOK  Int(LS)  Ext(LS)   DFFITS    DCOOK
   1     1.42     1.93     1.28     0.45     1.19     1.21     0.79     0.39
   2    -0.50    -0.69    -0.48    -0.23    -0.72    -0.71    -0.48    -0.24
   3     1.61     2.13     1.02     0.37     1.55     1.62     0.74     0.36

Figure 1.  Residual plot for LS fit of Example 1

Figure 2.  Residual plot for Wilcoxon fit of Example 1

Figure 3.  Residual plot for Bent Score fit of Example 1